- <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
- <html> <head>
- <title>Readme for analog -- inclusions and exclusions</title>
- </head>
-
- <body>
- [ <a href="Readme.html">Top</a> | <a href="custom.html">Up</a> |
- <a href="alias.html">Prev</a> | <a href="args.html">Next</a> |
- <a href="map.html">Map</a> | <a href="indx.html">Index</a> ]
- <h1>Readme for
- <a href="http://www.statslab.cam.ac.uk/~sret1/analog/">analog 4.03</a></h1>
- <h2>Inclusions and exclusions</h2>
-
- After aliasing each item, analog decides whether that item is wanted or not.
- The whole line is only counted if all the items are wanted.
- Whether an item is wanted or not is determined by <kbd>INCLUDE</kbd> and
- <kbd>EXCLUDE</kbd> commands specified by the user. These commands can be used
- to exclude requests from your local users, for example, or to analyse only
- files in a subdirectory. For example,
- <pre>
- HOSTEXCLUDE mycomputer.myisp.com
- </pre>
- would exclude all requests by that computer from the statistics.
- <p>
- The rule for determining whether an item is included or excluded is as
- follows. All the <kbd>INCLUDE</kbd> and <kbd>EXCLUDE</kbd> commands for that
- item are considered one by one in order, and the item is included or excluded
- according to the last command it matched. Items which don't match any of
- the <kbd>INCLUDE</kbd> or <kbd>EXCLUDE</kbd> commands are included if the first
- command was an exclusion, and excluded if the first command was an inclusion.
- For example, the configuration
- <pre>
- FILEINCLUDE /~sret1/*
- FILEEXCLUDE /~sret1/backgammon/*,/~sret1/analog/*
- FILEINCLUDE /~sret1/backgammon/*.gif
- </pre>
- would instruct the program to examine only my files, excluding my
- backgammon and analog files, but including gifs in my backgammon directory.
- On the other hand,
- <pre>
- FILEEXCLUDE /~sret1/*/img/*
- </pre>
- would analyse all files, except for images in my various directories. Note that
- inclusions and exclusions can contain any number of wildcards.
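- <p>
- As a further illustration of the rule, here is a small sketch using
- hypothetical host names (they are placeholders, not taken from a real
- configuration):
- <pre>
- HOSTEXCLUDE *.myisp.com
- HOSTINCLUDE www.myisp.com
- </pre>
- Because the first command is an exclusion, hosts matching neither command
- (say <kbd>some.other.net</kbd>) are included. <kbd>www.myisp.com</kbd>
- matches both commands and is included, because the last command it matched
- was the inclusion. Every other host in <kbd>myisp.com</kbd> matches only the
- exclusion and is dropped. Swapping the two lines would change the result
- completely: <kbd>www.myisp.com</kbd> would be excluded by its last match, and,
- since the first command would then be an inclusion, so would every host
- that matched nothing.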
- <p>
- The full list of these commands is <kbd>HOSTINCLUDE</kbd> and
- <kbd>HOSTEXCLUDE</kbd>; <kbd>FILEINCLUDE</kbd> and <kbd>FILEEXCLUDE</kbd>;
- <kbd>BROWINCLUDE</kbd> and <kbd>BROWEXCLUDE</kbd>; <kbd>REFINCLUDE</kbd> and
- <kbd>REFEXCLUDE</kbd>; <kbd>USERINCLUDE</kbd> and <kbd>USEREXCLUDE</kbd>; and
- <kbd>VHOSTINCLUDE</kbd> and <kbd>VHOSTEXCLUDE</kbd>.
- <p>
- Because the inclusions and exclusions take place <em>after</em> the aliasing,
- the name you must use is the aliased name. (In the absence of
- <kbd><a href="alias.html#OUTPUTALIAS">OUTPUTALIAS</a></kbd> commands, this is
- the name of the item in the output.)
- <p>
- Sometimes a line doesn't contain a particular sort of item, either because
- there is no field reserved for it on the line, or because the browser didn't
- send it for that request. You can include or exclude these lines by making a
- special blank entry in the <kbd>INCLUDE</kbd> or <kbd>EXCLUDE</kbd>
- command. For example,
- <pre>
- USERINCLUDE jim
- USERINCLUDE ""
- </pre>
- would include lines from user <kbd>jim</kbd> and lines without any user
- specified.
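- <p>
- The same blank entry can be used in an exclusion. For instance, assuming
- you only want to analyse requests from authenticated users, a configuration
- along the lines of
- <pre>
- USEREXCLUDE ""
- </pre>
- would be expected to exclude the lines without any user specified. Since the
- only command is an exclusion, lines which do have a user are included by the
- rule described earlier.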
- <p>
- The behaviour of <kbd>REQINCLUDE</kbd> and <kbd>REFINCLUDE</kbd> can be
- slightly unintuitive if the file has <a href="args.html#unintuitive">search
- arguments</a>.
- <p>
- <a name="incregexp">On suitable operating systems</a>, you can use regular
- expressions for the inclusions and exclusions by prefixing the expression with
- "<kbd>REGEXP:</kbd>" or "<kbd>REGEXPI:</kbd>". I've
- already described this at length in the context of aliases, so you can
- <a href="alias.html#aliasregexp">look there</a> for all the details.
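- <p>
- As a rough sketch of what this looks like for an exclusion (the pattern is
- only an illustration, and the exact regular expression syntax available
- depends on your system),
- <pre>
- FILEEXCLUDE REGEXP:^/~sret1/.*\.gif$
- </pre>
- is intended to exclude any file under <kbd>/~sret1/</kbd> whose name ends in
- <kbd>.gif</kbd>. With <kbd>REGEXPI:</kbd> in place of <kbd>REGEXP:</kbd>, the
- match should ignore case, so names ending in <kbd>.GIF</kbd> would be caught
- as well.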
- <p>
- If you get confused with all the inclusions and
- exclusions, remember that you can always run <kbd>analog -settings</kbd>
- to see what the options you have specified represent.
- <hr>
- <a name="FROMTO">There is also</a> one other pair of commands which belongs in
- this category,
- namely the <kbd>FROM</kbd> and <kbd>TO</kbd> commands. These specify a time
- period to restrict the analysis to. The simplest usage of these commands is
- <kbd>FROM yyMMdd</kbd> or <kbd>FROM yyMMdd:hhmm</kbd>, where <kbd>yy</kbd>
- represents the last two digits of the year (analog assumes that the year is
- between 1970 and 2069), <kbd>MM</kbd> represents the month,
- <kbd>dd</kbd> is the day of the month, <kbd>hh</kbd> the hour, and <kbd>mm</kbd> the
- minute. So, for example, to analyse only requests from
- July 1999 to June 2000 I would use the configuration
- <pre>
- FROM 990701
- TO 000630
- </pre>
- Alternatively, each of the components can be preceded by <kbd>+</kbd> or
- <kbd>-</kbd> to represent time relative to the time at which the program was
- invoked. In this case, the day component can have more than 2 digits. This allows
- constructions like
- <pre>
- FROM -01-00+01 # from tomorrow last year
- TO -00-0131 # to the end of last month (OK even if last month
- # didn't have 31 days)
- FROM -00-00-112
- TO -00-00-01 # statistics for the last 16 weeks
- FROM -00-00-00:-06+01 # statistics for the last 6 hours
- </pre>
- There are command line abbreviations <kbd>+F</kbd> and <kbd>+T</kbd>
- for the <kbd>FROM</kbd> and <kbd>TO</kbd> commands; for example,
- <kbd>+T-00-00-01:1800</kbd> looks at statistics until 6pm yesterday.
- <kbd>-F</kbd> and <kbd>-T</kbd> turn off the from and to, as do <kbd>FROM
- OFF</kbd> and <kbd>TO OFF</kbd>.
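- <p>
- To illustrate the form with a time of day, here is a sketch with made-up
- dates:
- <pre>
- FROM 000101:0900
- TO 000101:1730
- </pre>
- This should restrict the analysis to requests between 9am and 5.30pm on
- 1 January 2000. A relative <kbd>FROM</kbd> can also be combined with
- <kbd>TO OFF</kbd>, for example
- <pre>
- FROM -00-00-07   # from seven days before the program was run
- TO OFF           # no upper limit
- </pre>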
- <hr>
- <a name="outputexcludes">There are also</a> <kbd>INCLUDE</kbd> and
- <kbd>EXCLUDE</kbd> commands for most of
- the reports. These exclude individual lines from particular reports. So, for
- example, the command
- <pre>
- REFREPEXCLUDE http://your.site.com/*
- </pre>
- would exclude your internal referrers from the Referrer Report. However, it
- would not exclude them from the Failed Referrer Report, the Referring Site
- Report, etc. (you need to use <kbd>FAILREFEXCLUDE</kbd>,
- <kbd>REFSITEEXCLUDE</kbd> etc. for that); nor would it prevent other analysis
- of logfile lines with those referrers, as <kbd>REFEXCLUDE</kbd> would. Also,
- referrers excluded by <kbd>REFREPEXCLUDE</kbd> are still counted in the "not
- listed" line at the bottom of the report.
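- <p>
- The difference can be seen by putting the commands side by side. This is
- only a sketch, reusing the placeholder site name from the example above:
- <pre>
- REFEXCLUDE http://your.site.com/*       # ignore these logfile lines altogether
- </pre>
- as against
- <pre>
- REFREPEXCLUDE http://your.site.com/*    # hide them in the Referrer Report
- REDIRREFEXCLUDE http://your.site.com/*  # ... the Redirected Referrer Report
- FAILREFEXCLUDE http://your.site.com/*   # ... and the Failed Referrer Report
- </pre>
- The first form removes the lines from all the statistics; the second only
- affects which lines are listed in those three reports.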
- <p>
- The full list of these commands is <kbd>REQINCLUDE</kbd> and
- <kbd>REQEXCLUDE</kbd>; <kbd>REDIRINCLUDE</kbd> and <kbd>REDIREXCLUDE</kbd>;
- <kbd>FAILINCLUDE</kbd> and <kbd>FAILEXCLUDE</kbd>; <kbd>TYPEINCLUDE</kbd> and
- <kbd>TYPEEXCLUDE</kbd>; <kbd>DIRINCLUDE</kbd> and <kbd>DIREXCLUDE</kbd>;
- <kbd>HOSTREPINCLUDE</kbd> and <kbd>HOSTREPEXCLUDE</kbd>; <kbd>DOMINCLUDE</kbd>
- and <kbd>DOMEXCLUDE</kbd>; <kbd>ORGINCLUDE</kbd> and <kbd>ORGEXCLUDE</kbd>;
- <kbd>REFREPINCLUDE</kbd> and
- <kbd>REFREPEXCLUDE</kbd>; <kbd>REFSITEINCLUDE</kbd> and
- <kbd>REFSITEEXCLUDE</kbd>; <kbd>SEARCHQUERYINCLUDE</kbd> and
- <kbd>SEARCHQUERYEXCLUDE</kbd>; <kbd>SEARCHWORDINCLUDE</kbd> and
- <kbd>SEARCHWORDEXCLUDE</kbd>; <kbd>REDIRREFINCLUDE</kbd> and
- <kbd>REDIRREFEXCLUDE</kbd>; <kbd>FAILREFINCLUDE</kbd> and
- <kbd>FAILREFEXCLUDE</kbd>; <kbd>BROWSUMINCLUDE</kbd> and
- <kbd>BROWSUMEXCLUDE</kbd>; <kbd>FULLBROWINCLUDE</kbd> and
- <kbd>FULLBROWEXCLUDE</kbd>; <kbd>OSINCLUDE</kbd> and <kbd>OSEXCLUDE</kbd>;
- <kbd>VHOSTREPINCLUDE</kbd> and
- <kbd>VHOSTREPEXCLUDE</kbd>; <kbd>USERREPINCLUDE</kbd> and
- <kbd>USERREPEXCLUDE</kbd>; and <kbd>FAILUSERINCLUDE</kbd> and
- <kbd>FAILUSEREXCLUDE</kbd>. If you are using any
- <a href="alias.html#OUTPUTALIAS">output aliases</a>, the inclusion or exclusion
- applies to the name before the output alias is applied.
- <p>
- <!-- not just in output IN/EXCLUDEs, although the layout of this text might -->
- <!-- imply that so as to present REQINCLUDE pages in the right place -->
- You can also use the symbolic word <kbd>pages</kbd> in suitable
- <kbd>INCLUDE</kbd> and <kbd>EXCLUDE</kbd> commands; one very common command is
- <pre>
- REQINCLUDE pages
- </pre>
- to include only pages in the request report.
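- <p>
- Assuming the report commands are combined by the same last-match rule as
- the other inclusions and exclusions, a sketch such as
- <pre>
- REQINCLUDE pages
- REQEXCLUDE /~sret1/backgammon/*
- </pre>
- would list only pages in the Request Report, leaving out those in the
- backgammon directory.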
- <hr>
- <a name="PAGEINCLUDE">Analog determines</a> which files should count as pages
- (and thus which requests
- count as page requests) using another <kbd>INCLUDE</kbd>/<kbd>EXCLUDE</kbd>
- pair, called <kbd>PAGEINCLUDE</kbd> and <kbd>PAGEEXCLUDE</kbd>.
- By default, <kbd>*.html</kbd>, <kbd>*.htm</kbd> and directories (<kbd>*/</kbd>)
- count as pages. You can change the list with commands like
- <pre>
- PAGEINCLUDE *.ps,*.ps.gz
- PAGEEXCLUDE /sret1.html
- </pre>
- That is, PostScript and gzipped PostScript files are pages, but <kbd>/sret1.html</kbd>
- isn't. (If the file has <a href="args.html">search arguments</a>, the
- <kbd>PAGEINCLUDE</kbd> and <kbd>PAGEEXCLUDE</kbd> are reckoned just on the
- part of the filename before the question mark.)
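- <p>
- For example, assuming your site has scripts ending in <kbd>.pl</kbd> (a
- hypothetical case),
- <pre>
- PAGEINCLUDE *.pl
- </pre>
- should count a request for <kbd>/cgi-bin/search.pl?q=analog</kbd> as a page
- request, because only <kbd>/cgi-bin/search.pl</kbd> is compared against the
- pattern.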
- <hr>
- <a name="LINKINCLUDE">There is one more</a> set of <kbd>INCLUDE</kbd> and
- <kbd>EXCLUDE</kbd> commands which I'll describe now. In the Request Report
- and the three referrer reports (Referrer Report, Redirected Referrer Report
- and Failed Referrer Report), analog can link to the files which it's
- listing. There are commands <kbd>LINKINCLUDE</kbd> and <kbd>LINKEXCLUDE</kbd>
- for the Request Report, and <kbd>REFLINKINCLUDE</kbd> and
- <kbd>REFLINKEXCLUDE</kbd> for the referrer reports, to specify exactly which
- files are linked to. So, for example,
- <pre>
- REFLINKINCLUDE pages
- </pre>
- would link to pages in the three referrer reports.
- <hr>
- There is one final set of <kbd>INCLUDE</kbd> and <kbd>EXCLUDE</kbd> commands
- to include or exclude the search arguments at the end of URLs. But there are
- some slightly complicated issues surrounding those, so they deserve a
- <a href="args.html">new section</a>.
- <hr>
- <address><a HREF="http://www.statslab.cam.ac.uk/~sret1/">Stephen Turner</a>
- <br>Need help with analog? <a href="mailing.html">Subscribe to the analog-help
- mailing list</a>
- </address>
- </body> </html>
-